Chapter 3.1. Introduction to Contemporary Dynamic Music Techniques

Chapter 2 summarizes how the history of video game development has led to current practices. Chapter 3 surveys those practices through the literature and through games created by industry professionals. Chapter 4 uses the knowledge from Chapters 2 and 3 to analyze DMSs.

A problem I’ve encountered in my research on dynamic music composition techniques is that different composers and researchers identify, interpret, and describe techniques differently. In the pursuit of describing a novel function, new terms are invented for old concepts, and words with different meanings become synonymous with each other. I do not intend to fully resolve the semantics of the field, but I do attempt to clarify various terms that may cause confusion. I will discuss the variety of terms found in the literature and use my best judgment to assess, categorize, and consolidate them in a way that describes unique functions.

I group DMS techniques into two primary categories: vertical layering and horizontal resequencing. Within the vertical layering category there are a number of techniques that possess unique functions. Within the horizontal resequencing category there are a variety of terms used to describe how DMSs are designed to unfold through time. To summarize the two, vertical layering involves the layering of assets to add, subtract, replace, or alter the current musical material. Horizontal resequencing involves slicing music assets into short segments which can loop, branch, or transition to other segments in a variety of different sequences. Horizontal resequencing often dictates the structural arc or branching behavior through which the music unfolds.

Below is a list of musical components that dynamic music techniques are able to alter:

Variable tempo
Variable pitch
Variable rhythm
Variable volume/dynamics
Variable DSP (Digital Signal Processing)/timbres
Variable melodies (algorithmic generation)
Variable harmony (chordal arrangements, key or mode)
Variable mixing
Variable form (open form)
Variable form (branching)

(Collins, 2008, 147-165)

Through the investigation of these components, Collins is able to describe the dynamic capability of various types of video game music (Collins, 2008). Collins does not seek to label or categorize specific composition techniques, but instead describes the functions of the music in the context of gameplay. Winifred Phillips takes a similar approach in her book “A Composer’s Guide to Game Music” (Phillips, 2014, 185-202): techniques are categorized only as horizontal or vertical functions, and further detail is articulated by describing how the music behaves in-game. I also employ this type of categorization method (as opposed to assigning unique terms to subtle variances of a primary technique).

__________________________________________________



Chapter 3.2. Vertical Layering Techniques

Vertical layering techniques are used to add, subtract, or interchange the current musical material. Vertical layering techniques offer the potential to have a limitless number of orchestrations or re-harmonizations for a track. The reactiveness of vertical layering techniques is easily perceptible to the player, as there is a sense of immediacy in how the music is able to change at a moment’s notice. Changing musical material through the use of vertical layering techniques can be effective for communicating sudden change. However, the potential for immediate musical change may influence the player’s expectation of the music’s structural development.

Consider a player exploring a forest to the soundtrack of a serene string orchestra. The player is immersed in the music, expecting the lyrical melodies and lush harmonies to sweep them through the beautiful landscape. Suddenly, a bear attacks. The music quickly ramps up by adding additional layers of percussion, replacing consonant harmonies with dissonance, and replacing the string orchestra with a brass choir. The change may be very sudden and unexpected, possibly interrupting the middle of a serene musical phrase. After building tension through the climactic bear fight scene, the composer must use caution in how they resolve the music and return to the more serene state. Does the music ever return to the serene state? Does it happen after a specific amount of time? Or does the combat orchestration and re-harmonization fade out and the serene forest music suddenly start again? These questions represent the challenges that composers must work around when using vertical layering techniques.

Sweet offers a warning about using vertical layering techniques:

“…composers forget about how music is shaped—not just with parts, but through how those parts change over time. Rarely does a single instrument play through an entire piece of music…Unfortunately, a game can’t duplicate the composer’s unique ability to create an emotional story through music by just fading in and out layers. Each layer should still have a complete musical thought containing dynamics and emotional levels…The most effective scores that use vertical remixing start by being an excellent piece of music that has dynamics, swells in intensity, and offers harmonic changes and everything that a good piece of music has, before splitting it up into layers” (Sweet, 2015, 163).


Sweet refers to vertical layering techniques as a type of vertical remixing (Sweet, 2015). Another common synonym is the term vertical re-orchestration. I have chosen to use the term vertical layering because I believe it has no implications other than the layering of music, and is broad enough to compartmentalize a variety of unique techniques within it. The term vertical layering has been used by other composers and researchers as well (Crowley, 2015; Phillips, 2014).



3.2.1. - Interchangeable Layers
The function of interchangeable layers is to allow different layers of music to replace one another by fading in and out. The replacement may contain new harmonic, melodic, or rhythmic material, or a variation of orchestration. Thomas states that when discussing music layers, “game composers are describing a group of synchronized music tracks that play back simultaneously, with one or more sounding while all others are muted” (Thomas, 2016, 109). Sweet refers to this technique as individually controllable layers, and distinguishes it from other techniques in that each layer of music is controlled by its own game parameter (Sweet, 2015, 159). Phillips refers to this as the interchange technique (Phillips, 2014, 195). Stevens and Raybould call it parallel forms (Stevens and Raybould, 2016, 131). Below are three examples (1-3).

1. Interchangeable layers are demonstrated in “Final Fantasy XV” (Square Enix, 2016) at Hammerhead Diner. The music playing outside of the diner is played by an acoustic ensemble with two acoustic guitars, harmonica, organ, and tambourine. When the player enters the diner the instruments are replaced. The organ remains, but is much quieter in the mix. The acoustic guitars become electric guitars, the tambourine becomes a full drum kit, and an electric bass is added. The structure and musical material are the same. However, the melody and counterpoint are embellished to be more stylistically appropriate to the new instrumentation.





The above illustration shows how the two tracks exist in parallel and how the crossfade occurs at the point of entering or exiting the diner.
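The crossfade between two parallel, synchronized tracks can be sketched in a few lines. This is only an illustration: the function name `crossfade_gains` and the choice of an equal-power curve are my assumptions, and nothing here is taken from how “Final Fantasy XV” actually implements its fade.

```python
import math


def crossfade_gains(progress: float) -> tuple[float, float]:
    """Return (outgoing_gain, incoming_gain) for an equal-power crossfade.

    progress runs from 0.0 (only the outgoing track is audible) to 1.0
    (only the incoming track is audible). Values outside that range are
    clamped. Both tracks keep playing in sync; only their gains change.
    """
    p = min(max(progress, 0.0), 1.0)
    outgoing = math.cos(p * math.pi / 2)
    incoming = math.sin(p * math.pi / 2)
    return outgoing, incoming


# Hypothetical usage: halfway through the door-opening fade.
outside_gain, inside_gain = crossfade_gains(0.5)
```

An equal-power curve keeps the combined loudness roughly constant during the fade, which is one common reason to prefer it over a straight linear fade.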



2. Interchangeable layers can also function with different layers being controlled individually at the same time (hence, Sweet’s term of individually controllable layers). Some musical layers may fade in and out, while others remain the same. This is demonstrated in my Mini-game #1, a game that I made in collaboration with Phi Dinh.

Mini-game #1 - Windows
Download Mini-game #1.zip, unzip it, then launch it

Mini-game #1 - Mac
Download Mini-game #1.app.zip, unzip it, then launch it



The above illustration shows how interchangeable layers can be controlled individually. Each layer’s audibility depends on the player’s actions. Seven different layers are used to substitute one another, altering the orchestration and intensity of the music. The logic is shown in the diagram. For the sake of simplicity in illustrating the interchangeability of layers, I have left out the part of the system’s logic that dictates the re-muting of layers when moving back from a color previously entered. For example, when moving from dark pink to green the violin melody is replaced by an electric guitar. When moving from green back to dark pink the electric guitar is replaced by the original violin melody.
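The violin and electric guitar swap described above can be expressed as a lookup from zone to audible layer. The sketch below covers only the two layers and two zones named in the text; the table `ZONE_LAYERS`, the role name `"melody"`, and the function `target_gains` are hypothetical names chosen for illustration, not the mini-game's actual implementation.

```python
# Hypothetical table: for each zone color, which layer fills each musical role.
ZONE_LAYERS = {
    "dark pink": {"melody": "violin"},
    "green": {"melody": "electric guitar"},
}

# Every layer plays in sync at all times; only the gains differ.
ALL_LAYERS = ["violin", "electric guitar"]


def target_gains(zone: str) -> dict[str, float]:
    """Return the gain each layer should fade towards in the given zone."""
    audible = set(ZONE_LAYERS[zone].values())
    return {layer: (1.0 if layer in audible else 0.0) for layer in ALL_LAYERS}
```

Moving from dark pink to green changes the target for the violin from 1.0 to 0.0 and for the electric guitar from 0.0 to 1.0, and moving back reverses it.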

3. Thomas illustrates a clever use of interchangeable layers which he calls mapped layers (Thomas, 2016, 112). The term mapped layers could be used to specifically describe how Thomas’ example functions, as a sub-categorization of the broader term, interchangeable layers. Thomas’ example shows that each layer is composed at a tempo divisible by the others, and the harmonic material of each ‘higher’ layer must become more active while fitting within the harmonic framework of the ‘lower’ layers. Rather than referring to this design as its own technique (mapped layers), I think it is more useful to describe it as a use of interchangeable layers that contain harmonic and temporal relationships. The diagram below is from Thomas’ book (used with his permission). The diagram shows the relationships between the different layers. Thomas explains that these layers replace each other as the player moves between exploring, questing, and combating (Thomas, 2016, 112-116). As one fades in, the other fades out. The layers are not designed to be added on top of each other, as that would be a different technique called additive layers, and would no longer be following the principles which I have defined to be interchangeable layers. With Thomas’ permission, I have composed music to accompany his diagram and compiled it into a video. The video shows how the music I composed transitions from one layer to the next, adhering to Thomas’ musical framework.





3.2.2. - Additive Layers
The function of additive layers is described by Thomas: “…additive layers begin with a foundational music track playing and add a new layer to the score for each new game state” (Thomas, 2016, 116). Sweet describes it as: “The additive layers technique adds music layers to the cue as the state changes, then removes them as the state reverts back again. Thus Layer 1 is generally playing all the time, and other layers are added in as the gameplay progress[es] [sic.]” (Sweet, 2015, 159).

I define additive layers as a function that allows different layers of music to be added or subtracted from a foundational base layer of music. The difference between this and interchangeable layers is that the new layers being added (faded in) do not serve as a replacement for existing layers that are already playing. Thomas’ earlier example of using interchangeable layers that are harmonically and temporally nested within each other (Example 3) is designed in a way that each layer replaces the existing one (as one fades in, the other fades out). However, if those layers were designed to be added and subtracted (as opposed to being interchanged) from the foundational layer, then that example would be classified as being an additive layering technique. Below are three examples (4-6).

4. Additive layers are demonstrated in “Destiny” (Bungie, 2014), in the game mode called Rift, where players compete against each other to achieve an objective. Players must compete to obtain possession of an item called a Spark. The team that obtains the Spark must take it to a zone defended by the opposing team. This zone is called the Rift. The team with the Spark has a limited amount of time to deliver it to the Rift. During that time the defending team is trying to eliminate the player that is in possession of the Spark to end the attacking team’s attempt at scoring a point. As the player holding the Spark moves closer to the Rift the music intensifies by adding additional layers of music, and as the player moves away those layers are subtracted. All players can hear this music. The intensity grows as additional layers are added. If the attacking team scores, or if the player holding the Spark is eliminated, the music concludes.





The above illustration shows how the system functions. As the Spark gets closer to the Rift new layers of music are added.
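A distance-driven additive system of this kind can be sketched as a function from distance to a layer count. I do not know how “Destiny” maps distance to layers; the evenly spaced thresholds and the names `active_layer_count`, `distance`, and `max_distance` are assumptions made purely for illustration.

```python
import math


def active_layer_count(distance: float, max_distance: float, layer_count: int) -> int:
    """Return how many layers should be audible for a given distance.

    The base layer is always audible. Additional layers are added at evenly
    spaced thresholds as the distance shrinks from max_distance to zero.
    """
    if layer_count < 1 or max_distance <= 0:
        raise ValueError("need at least one layer and a positive max_distance")
    # closeness is 0.0 at max_distance (or beyond) and 1.0 at the target.
    closeness = 1.0 - min(max(distance / max_distance, 0.0), 1.0)
    return 1 + math.floor(closeness * (layer_count - 1))
```

With five layers and a maximum distance of 100, a carrier at distance 100 hears only the base layer, a carrier at distance 50 hears three layers, and a carrier at distance 0 hears all five.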

5. Additive Layers do not have to function in a specific sequence, where layer one plays, then layer two is added, then layer three, etc. It may be possible for gameplay to warrant layer three being added before two, or for layer two to be subtracted before layer three. This is demonstrated in Mini-game #2, a game that I made in collaboration with Phi Dinh.

Mini-game #2 - Windows
Download Mini-game #2.zip, unzip it, then launch it

Mini-game #2 - Mac
Download Mini-game #2.app.zip, unzip it, then launch it



The above illustration identifies the different layers of music. All layers begin at the same time; however, only layer one is unmuted, and the rest are inaudible. As the player navigates to different zones the different layers are added or subtracted to change the intensity and complexity of the music. In this particular example the layers are designed to increase the harmonic complexity, as well as create the feeling of an increased tempo. In some ways, this is very similar to Example 3, where Thomas has mapped interchangeable layers to have harmonic and temporal relationships with each other. The key difference is that instead of replacing the layers, they are being added or subtracted from the base layer.
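The key difference between the two techniques can be stated as two small functions that decide which layers are unmuted. This is a minimal sketch with hypothetical names (`additive_mix`, `interchangeable_mix`); layers are indexed from zero here, so index 0 corresponds to "layer one" in the text.

```python
def additive_mix(active_extras: set[int], layer_count: int) -> list[bool]:
    """Additive layers: index 0 is the base layer and is always audible.

    Any other layer can be switched on or off independently and in any
    order, so layer 2 may sound without layer 1.
    """
    return [i == 0 or i in active_extras for i in range(layer_count)]


def interchangeable_mix(selected: int, layer_count: int) -> list[bool]:
    """Interchangeable layers: the selected layer replaces all the others."""
    return [i == selected for i in range(layer_count)]
```

For four layers, `additive_mix({2}, 4)` leaves the base layer and layer 2 audible, whereas `interchangeable_mix(2, 4)` leaves only layer 2 audible.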

6. Another use of additive layers may be to build up several layers of musical material to create a dramatic swell. This is demonstrated in Mini-game #3, again made in collaboration with Phi Dinh. This example is similar to the Destiny example; however, Mini-game #3 uses the principle of additive layers with a structured, non-looping piece of music. It contains many layers (each with a different grouping of instruments that do not play through the entire piece), and the player’s movement changes the music’s intensity and complexity relative to the linearity of the music. For example, trying to ramp up intensity at the very beginning of the track may add a couple of layers, but will not suddenly result in the feeling of climax. By contrast, ramping up intensity when the music’s structure is moving towards the climax will result in a dramatic swell. The structure of the linear music is felt regardless of the player’s movement. However, the intensity and complexity of the structure are dictated by the player.

Mini-game #3 - Windows
Download Mini-game #3.zip, unzip it, then launch it

Mini-game #3 - Mac
Download Mini-game #3.app.zip, unzip it, then launch it



The above illustration identifies the different layers of music and describes the logic that the system is controlled by. As the player moves upwards towards the End Zone more layers of music are added. As the player moves downward towards the Start Zone, layers of music are subtracted. Not all layers of music play through the entire piece.



3.2.3. - Stingers
Thomas defines a stinger as “a short burst of music created to match a specific game event…” (Thomas, 2016, 98). A stinger can be triggered to occur from silence or it can be added to existing musical material. When it is added onto existing music layers it is referred to by Stevens and Raybould as a type of ornamental form (Stevens and Raybould, 2016, 131). Triggering a stinger is unlike previous vertical layering techniques in that a stinger does not play in a muted state while it is not being used. Additive and interchangeable layers work by triggering all of the different music layers at the same time, but leaving them muted until they are required to be heard (at which point they are faded in). However, stingers are triggered to happen at a moment’s notice. The composer must be aware that, if not treated correctly, a stinger could occur out of time with the rest of the music or create dissonance against the present harmony. While it is not always required to take tempo, rhythm, or harmony into consideration, it is possible. This can be done by using specific implementation features in tools like Wwise, FMOD, Unity, or UE4. Stevens and Raybould refer to musically aware stingers as being either rhythmically aware, harmonically appropriate, or both (Stevens and Raybould, 2016, 391-395).
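A rhythmically aware stinger can be thought of as a trigger that is delayed until the next suitable beat. The sketch below is not the API of Wwise, FMOD, Unity, or UE4; it is a hypothetical stand-alone function that assumes a constant tempo, music that starts at time zero, and a caller-supplied choice of which beats in the bar count as strong.

```python
import math


def next_stinger_time(now: float, bpm: float, beats_per_bar: int = 4,
                      strong_beats: tuple[int, ...] = (0, 2)) -> float:
    """Return the time in seconds of the first strong beat at or after `now`.

    Beats are counted from zero at the start of the music, so beat index 0
    is the downbeat of every bar.
    """
    if not any(0 <= b < beats_per_bar for b in strong_beats):
        raise ValueError("strong_beats must contain a valid beat index")
    beat_length = 60.0 / bpm
    # The small epsilon keeps a request made exactly on a beat on that beat.
    beat = math.ceil(now / beat_length - 1e-9)
    while beat % beats_per_bar not in strong_beats:
        beat += 1
    return beat * beat_length
```

At 120 BPM in 4/4, a combat event at 0.6 seconds would place the stinger at 1.0 seconds (the third beat of the bar), and an event at 1.1 seconds would wait for the next downbeat at 2.0 seconds.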

7. Stingers are used to suddenly create tension in the game “Sekiro: Shadows Die Twice” (FromSoftware, 2019) when the player engages in combat. The stingers in Sekiro are often heard as percussive hits and musical vocal exclamations (not to be confused with the enemies’ combat vocalizations). The stingers always occur on strong beats of the existing music (when the music has a defined beat). The stingers also assist in smoothing out the transition from one track to the next (non-combat to combat). The transition between tracks in this example does not follow the previously described vertical layering scheme. Instead, this DMS functions by stopping one track and starting the next while simultaneously layering the stinger on top at the moment of transition. The stinger is a vertical layering technique. However, the way the tracks underneath it move from one to the next is called horizontal resequencing and is described in the next section (Chapter 3.3). The illustration of this example includes the horizontal resequencing technique.



__________________________________________________



Chapter 3.3. Horizontal Resequencing Techniques

Horizontal resequencing techniques change the musical material by stopping an existing track and starting a new one. Unlike vertical layering techniques, horizontal resequencing does not involve playing multiple compatible stems at the same time while fading them in and out. One of the most common methods for executing horizontal resequencing is to slice the music into smaller segments. The beginning of each segment acts as an entry point to the musical track, and the end of each segment acts as an exit point. The smaller these chunks are, the more adaptive the score can be. Whitmore describes how smaller chunks are able to more fluidly connect changes in gameplay, while longer segments may be too slow due to fewer exit points.

“If it’s just linear files in a game, the larger those chunks are the less they move with the visuals in the game, the more disconnect is going to be from the experience. Also, the more the likelihood the player would say ‘I’m just going to play something from my jukebox!’. If all one’s getting is a 5-minute piece that loops then there’s nothing interesting there…There has to be some degree of adaptability in the music which works with the level of adaptability of the game” (Velardo, 2017).


Whitmore describes a horizontal resequencing system as a transitional matrix (Brandon, 2002). Collins refers to this system as a variable branching form (Collins, 2008, 147-165). Sweet calls it a branching score (Sweet, 2015), and Folmann describes this approach as micro-scoring. Folmann explains:

“For Tomb Raider: Legend, we spent a long time creating a highly advanced proprietary streaming system that allows us to trigger micro-scores all over the game world. So, essentially, I can place scores for any change in the game, which is naturally a complex and time consuming process. The trend of games – particularly next-generation 360 and PS3 – is one of complexity. Everything is getting more detailed, whether its multiple translucent layers of textures, real-time generated light and shadow maps, massive streaming game worlds and so forth. Audio and music is no exception. The need for dissecting music into smaller fractions is becoming increasingly important in order to support the decisions and experiences of the player” (Latta, 2006).


As a side note, Chapter 4 will reveal that despite Folmann’s ideas about the growing complexity of dynamic music in video games, the industry has instead been relatively complacent with basic DMS designs.

An alternative to slicing a longer track into smaller audio files is to place markers across the duration of the longer track. The audio engine can be made to recognize each marker as an entry or exit point. Horizontal resequencing techniques function the same regardless of slicing or marking; however, there are some technical issues regarding preparation, implementation, and file organization (Stevens and Raybould, 2016, 178-182). Systems that use marking may require a crossfade between the two points. This is sometimes referred to as a crossfading score (Sweet, 2015, 145) or a transitional form (Stevens and Raybould, 2016, 178).
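The marker approach, and the earlier point that closely spaced exit points make a score more responsive, can be sketched as a search for the next marker after the current playback position. The function name `next_exit_point` and the marker times below are hypothetical; a real audio engine would expose its own marker mechanism.

```python
from bisect import bisect_left
from typing import Optional


def next_exit_point(position: float, exit_markers: list[float]) -> Optional[float]:
    """Return the first exit marker at or after `position`, in seconds.

    exit_markers must be sorted in ascending order. Returns None when the
    playback position is already past the last marker.
    """
    index = bisect_left(exit_markers, position)
    return exit_markers[index] if index < len(exit_markers) else None


# Hypothetical comparison: the same 12-second track marked densely and sparsely.
dense_markers = [0.0, 4.0, 8.0, 12.0]
sparse_markers = [0.0, 12.0]
wait_dense = next_exit_point(5.0, dense_markers) - 5.0
wait_sparse = next_exit_point(5.0, sparse_markers) - 5.0
```

With a change requested at 5 seconds, the densely marked track can exit after 3 seconds, while the sparsely marked one must wait 7 seconds.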

Regardless of whether the system uses slicing or marking, the function of horizontal resequencing remains the same: to change the musical material that is playing by stopping an existing segment and starting a new one. This is fundamentally different from vertical layering.

A composer may decide to loop sections of music until the player triggers a change, at which point one track will stop (when it reaches the next exit point), and a new one will begin (at a specified entry point). It is possible to state that specific exit points must lead to specific entry points. Sometimes an additional musical segment must be written to transition and bridge the two points. Depending on the gameplay a transition could be relatively long. Through this transition the composer has an opportunity to modulate to a new key or gradually change the tempo of a track. This means that the two tracks could be quite different, and the transition between the two could be seamless.
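The rule that specific exit points lead to specific entry points, optionally through a bridging segment, can be sketched as a lookup table. All segment names below are invented for illustration, and the table `TRANSITIONS` and function `plan_transition` are hypothetical rather than drawn from any particular game or tool.

```python
from typing import Optional

# Hypothetical table: (segment being exited, destination track) maps to
# (optional bridging segment, entry segment of the destination track).
TRANSITIONS: dict[tuple[str, str], tuple[Optional[str], str]] = {
    ("explore_a", "combat"): ("bridge_explore_to_combat", "combat_intro"),
    ("explore_b", "combat"): (None, "combat_intro"),
    ("combat_loop", "explore"): ("bridge_combat_to_explore", "explore_a"),
}


def plan_transition(current_segment: str, destination: str) -> list[str]:
    """Return the segments to play, in order, to reach the destination track.

    The current segment is always played to its exit point first. A bridging
    segment is inserted only when the table defines one for this exit point.
    """
    bridge, entry = TRANSITIONS[(current_segment, destination)]
    queue = [current_segment]
    if bridge is not None:
        queue.append(bridge)
    queue.append(entry)
    return queue
```

Because the bridge is stored per exit point, a composer can write a longer modulating transition for one exit and a direct cut for another.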

The terminology used to describe this technique is often applied loosely, which creates confusion when trying to understand the similarities and differences between the techniques that composers employ. To understand the fundamentals, it is irrelevant whether the composer is trying to create a system that randomly sequences segments of music to create a continuous track (referred to as swappable chunks (Thomas, 2016, 120)), or if the composer is stringing together a complex web of musical narrative pathways creating a transitional matrix. The concept behind both is the same: take a collection of tracks, create entry and exit points, dictate which exit points lead to which entry points, and, if needed, write transitional segments to bridge the two. Below are two examples (8, 9).

8. “The Elder Scrolls V: Skyrim” (Bethesda, 2011) uses horizontal resequencing to change between the music tracks that are playing depending on what the player is doing. Below is an example that shows how changing locations results in musical change. Without further investigation on my part, it seems that there are many tracks suitable for each location or situation the player enters. Of the suitable tracks, it seems random as to which one is played. This type of randomization of tracks is described by Thomas as a music set (Thomas, 2016, 119). In Skyrim the transition between tracks is achieved by crossfading or by stopping and starting. At times it seems the music is slow to respond to the player’s situation, which may indicate that the system’s exit points are not very close to one another. To re-quote Whitmore’s earlier statement, “…the larger those chunks are the less they move with the visuals in the game, the more disconnect is going to be from the experience. Also, the more the likelihood the player would say ‘I’m just going to play something from my jukebox!’” (Velardo, 2017).



9. A complex horizontal resequencing system can be experienced in Mini-game #4, which I made in collaboration with Phi Dinh. Mini-game #4 contains two zones: one blue and one pink. When the player moves from the blue zone to the pink, the music transitions from the blue music to the pink music. When the player moves from the pink zone to the blue, the music transitions back. The game shows a cooldown timer that signifies the time remaining before the player’s actions are registered once again by the game engine. The timer exists because the simplicity of this particular game allows the player to rapidly move back and forth, potentially glitching how segments are queued. It is likely that in an actual game the player will not be given a situation in which this type of behavior is possible.
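A cooldown of this kind can be sketched as a small gate that ignores zone changes arriving too soon after the last accepted one. I am not reproducing the mini-game's code here; the class name `ZoneChangeGate` and the two-second cooldown in the usage lines are assumptions for illustration only.

```python
from typing import Optional


class ZoneChangeGate:
    """Ignore zone changes that arrive before a cooldown has elapsed."""

    def __init__(self, cooldown_seconds: float) -> None:
        self.cooldown = cooldown_seconds
        self._last_accepted: Optional[float] = None

    def accept(self, now: float) -> bool:
        """Return True if a zone change at time `now` should be registered."""
        if (self._last_accepted is not None
                and now - self._last_accepted < self.cooldown):
            return False  # still cooling down, so the change is ignored
        self._last_accepted = now
        return True


# Hypothetical usage with a two-second cooldown.
gate = ZoneChangeGate(2.0)
results = [gate.accept(0.0), gate.accept(1.0), gate.accept(2.5)]
```

Only accepted changes would be passed on to the segment queue, which prevents rapid back-and-forth movement from queuing conflicting transitions.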

The system not only includes two tracks (blue and pink) which are sliced into many small segments, but also includes many transitional segments to help smoothly transition between them.

Mini-game #4 - Windows
Download Mini-game #4.zip, unzip it, then launch it

Mini-game #4 - Mac
Download Mini-game #4.app.zip, unzip it, then launch it



Download Mini-game #4 data table

The above diagram shows how the DMS functions. Each module represents an individual audio file. The lines show how they can transition to and from each other. The two tracks transition between each other via the transitional segments. Many of the transitional segments are able to transition back to the original track (blue or pink) via secondary transition segments. The illustration of the DMS visually shows each track multiple times to allow for this type of behavior to be shown without cluttering the diagram. The spreadsheet of data was used during my collaboration with Phi. It shows the event names used to implement the music, descriptions of how each segment functions as horizontal resequencing, and a box of text that we used to leave notes to each other about bugs.

__________________________________________________



The examples above illustrate systems that are accepted as a standard in video game development. Combinations of vertical and horizontal techniques are not uncommon, but are often simple. Perhaps this is because composers approach video game music wanting to achieve the goal of supporting the player’s moment-to-moment experiences, and are not trying to find solutions to the structural problems that are created as a result of that pursuit. Chapter 4 will investigate this in more depth, reviewing and proposing different methods for analyzing the techniques used in DMSs.

__________________________________________________

